ABSTRACT

Multivariate normality is required for some statistical tests. A 
graphical procedure for evaluating multivariate normality is 
illustrated. The logic for using the multivariate bootstrap is 
presented. The multivariate bootstrap can be used when distribution 
assumptions are not met, or for descriptive purposes in all cases. 
The multivariate bootstrap logic is illustrated for the canonical 
correlation case. Various software for conducting multivariate 
bootstrap analyses is described and cited. 

Researchers have increasingly recognized that multivariate
analyses are vital in the social sciences, for at least two reasons 
(Fish, 1988; Thompson, 1994c). First, use of multivariate methods 
avoids the inflation in experimentwise Type I error rates that 
occurs when univariate methods are employed in a single study to 
test multiple hypotheses that are at least partially uncorrelated. 
Second, and more importantly, multivariate methods can be employed 
to analytically honor a substantive reality in which most effects 
have multiple causes, and most effects have multiple consequences. 

For these reasons, multivariate methods are being employed 
with increasing frequency. For example, Emmons, Stallings and 
Layne (1990) studied 16 years of research reports in three 
journals, and found that 

the multivariate characteristic of the social 
science research environment with its many 
confounding or intervening variables has been 
addressed through the trend toward increased use of 
multivariate analysis of variance and covariance, 
multiple regression, and multiple correlation. (p. 14)

Similarly, Grimm and Yarnold (1995) recently noted that, "In the
last 20 years, the use of multivariate statistics has become
commonplace. Indeed, it is difficult to find empirically based
articles that do not use one or another multivariate analysis" (p.
vii).

The purpose of the present paper is to explore the
implications of violating the assumption of multivariate normality
that is required in some multivariate applications. Marascuilo and 
Levin (1983) nicely summed up several important features of the 
assumption: 

The multivariate normal distribution is somewhat 
hidden throughout multivariate methods. It is not 
required in the estimation and data description 
aspects of the theory. Its impact and role, however, 
are basic to the [statistical significance] 
inference procedures of multivariate analysis and it 
is here that it must be assumed. There are no
satisfactory tests of its truth in any one 
situation. 

Although multivariate normality is not required to estimate
most multivariate parameters (e.g., function coefficients,
structure coefficients), even in these cases the distributions of
the variables must be reasonably comparable. Multivariate
parameters are estimated using the correlation or the
variance/covariance matrix from the sample. As Thompson (1984, p.
17) noted, therefore, even aside from the assumptions required for
statistical significance testing,

...the magnitudes of the coefficients of the 
correlation [or covariance] matrix must not be 
attenuated by large differences in the shapes of the 
distributions for the variables. It is important to 
emphasize that... [parameter estimation usually]
does not require that the variables be normally
distributed as long as there is no substantial 
attenuation associated with distribution 
differences, regardless of what these distributions 
may be. 

The present paper first reviews one method for evaluating
multivariate normality. Next, the use of the bootstrap in such
situations is explored. Finally, a heuristic example is
presented.

Evaluating Multivariate Normality 

It is important to note initially that evaluations of 
univariate and bivariate normality are not sufficient to establish 
that the assumption of multivariate normality has been met. As 
Thompson (1984, p. 18) noted, 

...examining the univariate or the bivariate 
distributions will not conclusively resolve this 
uncertainty. Multivariate distributions can be 
nonnormal even when all subsets of univariate or 
bivariate distributions are normal, just as a 
bivariate distribution may be nonnormal even when 
both the individual variables are distributed in a 
normal manner.

One method of exploring whether the assumption of multivariate 
normality has been met involves a graphical procedure explained by 
Stevens (1986, pp. 207-212). Thompson (1990) provides a computer 
program that automates this procedure. Table 1 presents data used 
to illustrate this approach. The heuristic example involves scores
of 26 people on three variables. 



INSERT TABLE 1 ABOUT HERE 



Table 2 presents descriptive statistics for these data. Table
3 presents the Mahalanobis distances (D²) of each of the score
vectors for the 26 cases from the centroid, i.e., the Cartesian
coordinate defined by the means of the three variables
(6.40, 6.87, and 6.72, respectively). For example, case #8 had
scores of 6.7, 6.0, and 7.2, which are close to the three means,
respectively, thus resulting in the smallest D² value (0.694)
across the 26 cases.
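
As a check on the logic, the D² for case #8 can be reproduced from the means and the inverted variance/covariance matrix reported in Table 2. The following is a minimal Python sketch; the helper name `mahalanobis_sq` is introduced here purely for illustration and is not taken from the Thompson (1990) program.

```python
# Squared Mahalanobis distance, D^2 = (x - m)' S^-1 (x - m), using the
# means and inverted variance/covariance matrix reported in Table 2.
means = [6.40385, 6.86923, 6.71538]
s_inv = [
    [0.56740, -0.12024, -0.36973],
    [-0.12024, 0.33075, -0.26292],
    [-0.36973, -0.26292, 0.92454],
]

def mahalanobis_sq(x, m, inv_cov):
    """Return D^2 of score vector x from centroid m (illustrative helper)."""
    d = [xi - mi for xi, mi in zip(x, m)]
    # First S^-1 d, then the inner product d' (S^-1 d)
    sd = [sum(row[j] * d[j] for j in range(len(d))) for row in inv_cov]
    return sum(di * sdi for di, sdi in zip(d, sd))

case_8 = [6.7, 6.0, 7.2]
d2_case_8 = mahalanobis_sq(case_8, means, s_inv)
print(round(d2_case_8, 3))  # about 0.694, the Table 3 entry for case #8
```

The same function applied to each of the 26 score vectors in Table 1 would yield the full set of distances in Table 3, within rounding of the tabled matrix entries.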

INSERT TABLES 2 AND 3 ABOUT HERE 



In the graphical procedure these distances are sorted and
associated chi-square and p values are computed, as illustrated in
Table 4. Finally, the 26 pairs of D² and chi-square values are
plotted, as illustrated in Figure 1. If a reasonably straight
line is defined within the plot, the data are taken to be
multivariate normal.
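
The chi-square and p columns of Table 4 can be sketched in a few lines of Python. The sketch assumes, consistent with the tabled values, that the p for the case of rank i is (i - .5)/n and that the chi-square value is the corresponding quantile with df = 3 (the number of variables). The function names are hypothetical, and the closed-form distribution function used here holds only for df = 3; a statistics library would be the usual choice for other cases.

```python
import math

def chi2_cdf_df3(x):
    """Chi-square distribution function; this closed form is for df = 3 only."""
    if x <= 0:
        return 0.0
    return (math.erf(math.sqrt(x / 2))
            - math.sqrt(2 * x / math.pi) * math.exp(-x / 2))

def chi2_quantile_df3(p, lo=0.0, hi=100.0):
    """Invert the distribution function by simple bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf_df3(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n = 26
probs = [(i - 0.5) / n for i in range(1, n + 1)]
quantiles = [chi2_quantile_df3(p) for p in probs]

# Sorted D^2 values, copied from Table 4
d2_sorted = [0.69408, 0.82928, 0.94312, 1.19431, 1.28071, 1.30686,
             1.40629, 1.42369, 1.72773, 1.77029, 1.78797, 1.79599,
             2.00243, 2.38155, 2.41353, 2.67146, 2.71666, 2.75163,
             2.75819, 2.80861, 3.92584, 4.90097, 5.40434, 5.89352,
             7.68053, 10.53041]

# The (D^2, chi-square) pairs are the points plotted in Figure 1
for d2, q, p in zip(d2_sorted, quantiles, probs):
    print(f"{d2:9.5f} {q:9.5f} {p:8.5f}")
```

Plotting the first column against the second and judging whether the points fall near a straight line is the graphical check described above.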



INSERT FIGURE 1 ABOUT HERE 



The Multivariate Bootstrap 
Uses of the Bootstrap 

As has been noted elsewhere,

The bootstrap can actually be used in two somewhat
discrete ways. First, it can be used descriptively
to evaluate whether results are reasonably stable
over different configurations of subjects.... Second,
the bootstrap can be used inferentially, if we
consult all four statistics from our resampling
analyses, and use them to empirically construct
study-specific test distributions or confidence
intervals. This is only another approach to testing
statistical significance. (Thompson, 1993, pp. 372-373)

That is, when the assumption of multivariate normality cannot be 
met such that statistical significance testing cannot reasonably be 
conducted, one can use bootstrap methods to develop study-specific 
sampling distributions that can then be used in statistical tests. 

Of course, statistical significance tests are of extremely 
limited utility (Carver, 1978; Cohen, 1994; Thompson, 1993, 1994a, 
1996). These tests do not evaluate (a) the value or (b) the
replicability of our results. These inferential tests do not 
evaluate either the probability of population parameters or the 
probability of the statistics in future samples! 

But the bootstrap may still be valuable, even if we do not 
wish to perform statistical tests. The bootstrap may be used to 
explore the stability of parameter estimates when distributional 
problems may have attenuated certain relationships and consequently 
impacted the parameters estimated from these correlations or 
covariances. Of course, the bootstrap may be useful in describing
the replicability of results even when distributional assumptions 
are fully met. 

Logic of the Bootstrap 

The logic of the bootstrap has primarily been elaborated by 
Efron and his colleagues. Diaconis and Efron (1983) and Thompson 
(1994b) provide fairly accessible explanations. To make the 
present discussion concrete, let's presume that we had a sample of 
50 subjects' scores on two variables, and that we wanted to 
estimate the Pearson r between the two variables. We would 
initially compute this statistic for the sample. 

We would then draw a so-called random "resample" of 50
subjects from our original sample. But the trick is that we draw 
the resample with replacement. This means that our first subject 
may not be drawn at all in this resample. But subject #2 might be 
drawn several times. Thus, the resample consists of a different 
configuration of 50 subjects (some used multiple times) than our 
original sample. We would then compute the r in our resample. 

When we randomly draw our resamples, we randomly select all 
the scores of each given resampled subject (i.e., in this case 
pairs of scores). And the reason that we draw exactly 50 subjects
(at least in what should be at least one of the resampling 
strategies that we use) is to honor the influences of sampling 
error involved with sampling exactly 50 subjects. 

Of course, what we did once we could do a second time, by
drawing a completely independent second random resample of 50
subjects' pairs of scores on the two variables. This would
represent yet another configuration of the 50 original subjects' 
scores. We would then compute a second estimate of r. 

Over all of our resamples we could create a distribution of
our estimates of r. This would be an empirically estimated
sampling distribution, rather than the theoretically assumed
sampling distribution employed in conventional statistical tests
(Arnold, 1996). And the standard deviation of the various resample
parameter estimates is nothing less than an empirically estimated
standard error of the statistic (i.e., SE_r). The ratio of the r in
our original sample to this SE_r behaves like a t statistic.
However, another alternative is to employ the sampling distribution
to compute a confidence interval about our estimate.
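
The resampling logic just described can be sketched in Python as follows. The data are hypothetical (generated from a fixed random seed), the function name `pearson_r` is introduced only for illustration, and the simple percentile interval is one assumed way of building a confidence interval from the empirical sampling distribution, not necessarily the method used in the programs cited in this paper.

```python
import math
import random
import statistics

def pearson_r(pairs):
    """Pearson r for a list of (x, y) score pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(12345)  # fixed seed so the run is repeatable

# Hypothetical sample: 50 subjects' scores on two correlated variables
sample = []
for _ in range(50):
    x = rng.gauss(0, 1)
    sample.append((x, 0.5 * x + rng.gauss(0, 1)))

r_sample = pearson_r(sample)

# Draw 1,000 resamples of exactly 50 subjects, with replacement; each
# draw keeps a given subject's pair of scores together
n_resamples = 1000
boot_rs = []
for _ in range(n_resamples):
    resample = [rng.choice(sample) for _ in range(len(sample))]
    boot_rs.append(pearson_r(resample))

se_r = statistics.stdev(boot_rs)     # empirically estimated SE of r
ratio = r_sample / se_r              # behaves like a t statistic
ordered = sorted(boot_rs)
ci_95 = (ordered[24], ordered[974])  # rough 95% percentile interval

print(round(r_sample, 3), round(se_r, 3), ci_95)
```

Because whole subjects are drawn, some pairs appear several times in a given resample and others not at all, which is what makes each resample a different configuration of the original 50 subjects.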

It is conventional practice to resample at least 1,000 times,
and 1,500 or 2,000 resamples would not be uncommon. More resamples
are especially important if our purpose is inferential (i.e.,
statistical significance testing), because here the tails of the
sampling distribution are the focus, and considerably more resamples
are required to adequately estimate these parts of the sampling
distribution. Alternatively, in the descriptive application, our
focus is basically on the question, "if we mix up our subjects in
a whole lot of ways, do we still get basically the same estimate,
no matter what we do?"

Obviously, from a practical point of view the bootstrap
approach requires computer automation. Lunneborg (1987) has offered
some excellent microcomputer programs that automate this logic for
univariate applications. In fact, user-friendly PC bootstrap
software has become available from publishers around the world.
Examples of such software and the distributors of the software
include: (a) "Resampling Stats", distributed by Resampling Stats,
612 N. Jackson, Arlington, VA 22201; (b) "Statistical Calculator",
distributed by Erlbaum, 27 Palmeira Mansions, Church Road, Hove,
East Sussex BN3 2FA, United Kingdom; (c) SPIDA, distributed on
behalf of its Australian author by SERC, 1107 NE 45th, Suite 520,
Seattle, WA 98105; and (d) the menu-driven program, BOJA,
distributed by iec ProGAMMA, P.O. Box 841, 9700 AV Groningen, The
Netherlands.

A Multivariate Logic 

The use of the bootstrap in univariate applications is quite
straightforward. However, in multivariate analyses which produce
multiple sets of estimates (e.g., two discriminant functions, three
factors), special problems arise. In one resample a given construct
may appear as the first function [equation, or factor], but in
another resample the same construct may arise as the second
function [equation, or factor]. Such variations are usually not
substantively relevant or troubling, as long as the underlying
constructs are invariant, but do create analytic problems.

In bootstrap applications using structural equation modeling, 
the solution is quite straightforward: simply use the matrix 
declaring the fixed and freed parameters to define a common factor 
space across all resamples. But in classical multivariate methods 
a solution is not as obvious. 

Thompson (1995) explained the problem and proposed a solution:

The major barrier to conducting a multivariate
bootstrap involves the multidimensional character of
the "space" in which the analysis is conducted. The
bootstrap must be applied such that each of the
hundreds or thousands of resampling results are all
located in a common factor space before the mean,
SD, skewness and kurtosis are computed.... If the
analyst computed mean structure (or pattern)
coefficients for the first variable on the first
component across all the repeated samplings, the
mean would be a nonsensical mess representing an
average of some apples, some oranges, and perhaps
some kiwi. The sampled solutions must be rotated to
best fit positions with a common target solution,
prior to computing means and other statistics across
the [re]samples, so that the results are reasonable.
(pp. 88-89)

In short, a "target" matrix is used to define a common factor 
space, and all resample results are rotated to best-fit position 
with this factor space using Procrustean rotation. Such 
applications can be generalized across classical parametric 
analyses, because all such analyses are special cases of canonical 
correlation analysis (Fan, 1992; Knapp, 1978; Thompson, 1984,
1991).
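
The idea of rotating a resample solution to best-fit position with a target can be sketched for the simplest case of two functions. The Python sketch below uses an orthogonal Procrustean rotation in closed form for two columns; this is an assumption made for illustration, and the rotation actually implemented in the programs cited here may differ (a general solution for more than two functions would ordinarily use a singular value decomposition). All coefficient values and function names are hypothetical.

```python
import math

def matmul(x, y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

def procrustes_2col(a_mat, target):
    """Orthogonal 2x2 matrix T minimizing the sum of squared
    differences between a_mat T and target (two-column case only)."""
    a_t = [list(col) for col in zip(*a_mat)]
    (a, b), (c, d) = matmul(a_t, target)
    rot = math.hypot(a + d, b - c)   # fit attainable by a pure rotation
    ref = math.hypot(a - d, b + c)   # fit attainable with a reflection
    if rot >= ref:
        return [[(a + d) / rot, (b - c) / rot],
                [(c - b) / rot, (a + d) / rot]]
    return [[(a - d) / ref, (b + c) / ref],
            [(b + c) / ref, (d - a) / ref]]

# Hypothetical target: coefficients of four variables on two functions
target = [[0.8, 0.1], [0.7, -0.2], [0.1, 0.9], [-0.1, 0.6]]

# Hypothetical resample result: the same two constructs, but the
# functions emerged in the opposite order and one is sign-reversed
resample = [[-row[1], row[0]] for row in target]

t_mat = procrustes_2col(resample, target)
aligned = matmul(resample, t_mat)

for row in aligned:
    print([round(v, 4) for v in row])
```

After rotation the resample coefficients line up with the target column by column, so means and standard deviations computed across resamples refer to the same construct each time.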

Thompson (1988, 1992, 1995) provides software for multivariate 
applications (factor analysis, descriptive discriminant analysis,
and canonical correlation analysis, respectively) that all invoke 
this solution. Borrello and Thompson (1989) and Scott, Thompson, 
and Sexton (1989) are examples of applications of the multivariate 
bootstrap. 

Heuristic Example of the Multivariate Bootstrap 

An application of the canonical bootstrap program, CANSTRAP 
(Thompson, 1995), is presented here as an illustration of the
procedure. The illustration employs scores on six variables (i.e., 
four in one set, and two in the other set) from 50 cases from the 
Holzinger and Swineford (1939, pp. 81-91) data. These scores on 
ability batteries have classically been used as examples in both 
popular textbooks (Gorsuch, 1983, passim) and computer program 
manuals (Joreskog & Sorbom, 1989, pp. 97-104), and thus are 
familiar to many readers. 

Appendix A presents the program output for the data. First, 
canonical results are derived for the original sample. As reported 
in the appendix (p. 22 of the present paper), the 6x6 correlation
matrix is computed, and partitioned into the quadrants associated
with the variable sets. Also as reported in the appendix ("matrix
to be analyzed", p. 23), the so-called 2x2 "quadruple-product"
matrix is then computed (cf. Thompson, 1984).

The eigenvalues from the principal components analysis of the
quadruple-product matrix are the two squared canonical correlation
coefficients for these data (see p. 23). The two components are
then used to compute canonical function coefficients (p. 23), and
subsequently canonical structure coefficients (p. 24) (Thompson &
Borrello, 1985). The function coefficient matrix becomes the target
matrix (p. 24) for Procrustean rotations for all the resamples.
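
The relationship between the quadruple-product matrix and the squared canonical correlations can be sketched in Python. The sketch assumes one common form of the quadruple product, R22^-1 R21 R11^-1 R12, and uses hypothetical correlations with only two variables in each set (rather than the four and two of the present example) so that the inverses and eigenvalues have simple closed forms. None of the numbers below come from the Holzinger and Swineford data or from CANSTRAP output.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(x, y):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

# Hypothetical quadrants of a partitioned 4x4 correlation matrix
r11 = [[1.0, 0.5], [0.5, 1.0]]          # within the first variable set
r22 = [[1.0, 0.4], [0.4, 1.0]]          # within the second variable set
r12 = [[0.6, 0.3], [0.2, 0.5]]          # between the two sets
r21 = [list(col) for col in zip(*r12)]  # transpose of r12

# Quadruple-product matrix (assumed form): R22^-1 R21 R11^-1 R12
quad = matmul(matmul(inv2(r22), r21), matmul(inv2(r11), r12))

# Eigenvalues of a 2x2 matrix from its trace and determinant
trace = quad[0][0] + quad[1][1]
det = quad[0][0] * quad[1][1] - quad[0][1] * quad[1][0]
root = math.sqrt(trace ** 2 - 4 * det)
rc_squared = [(trace + root) / 2, (trace - root) / 2]
rc = [math.sqrt(v) for v in rc_squared]

print([round(v, 4) for v in rc_squared], [round(v, 4) for v in rc])
```

In a bootstrap run this computation would be repeated within every resample before the resample's coefficients are rotated to the target matrix.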

Appendix A (pp. 24-30) presents full results for the
first two resamples. As noted previously, in each resample a given
subject may be drawn not at all, or once, or multiple times. For
example, in resample #1 (pp. 24-27) person 18 was drawn twice (as
the 1st person in the resample and as the 14th person in the
resample).

In the present example, 1,000 resamples were drawn. CANSTRAP 
then presents a description of the resampling process, so that 
randomness can be confirmed. For example, it is noted (p. 31) that 
person 1 was drawn once in resample #1, once in resample #2, not at 
all in resample #3, and twice in resample #991. Across the 1,000 
resamples, person 1 was resampled 1,008 times (p. 31). The fewest 
times a person was resampled was 942; the most times was 1,056 (p.
32).
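
A check of this kind is easy to sketch in Python. With 50 subjects and 1,000 resamples of size 50, each person should be drawn roughly 1,000 times in total. The seed and variable names below are arbitrary choices for illustration, and the counts will not match the CANSTRAP figures reported above.

```python
import random
from collections import Counter

rng = random.Random(2024)  # arbitrary fixed seed for repeatability
n_subjects, n_resamples = 50, 1000

counts = Counter()
for _ in range(n_resamples):
    # Indices of the subjects drawn, with replacement, for one resample
    drawn = [rng.randrange(n_subjects) for _ in range(n_subjects)]
    counts.update(drawn)

times_drawn = [counts[s] for s in range(n_subjects)]
print(min(times_drawn), max(times_drawn))  # both should be near 1,000
```

Counts that stray far from 1,000 for particular subjects would suggest a problem with the random number generation rather than with the data.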

Then the multivariate bootstrap results are presented. The
mean on Function I was .43958 (p. 32). The empirically estimated
standard error of this statistic was .11744 (p. 32). Of course,
when standard errors are empirically estimated, the standard errors
may differ for different parameter estimates even when two
parameter estimates are identical and sample size is a given fixed
value.

The appendix also presents the mean function and structure 
coefficients (pp. 32-34) for this analysis. For example, across 
1,000 resamples the structure coefficients of variables 3 and 4 on
Function I were roughly equal (+.5515 and +.5407, respectively), 
while the standard errors for these estimates (.6003 and .4675, 
respectively) were not as equal. In this case the smaller of these 
two parameter estimates was somewhat more stable across the various 
1,000 configurations of the original 50 subjects. 

Summary 

Multivariate normality is required for some statistical tests. 
A graphical procedure for evaluating multivariate normality was 
presented. The logic for using the multivariate bootstrap was 
presented. The multivariate bootstrap can be used when distribution 
assumptions are not met, or for descriptive purposes in all cases. 
The multivariate bootstrap logic was illustrated for the canonical 
correlation case. Various software for conducting multivariate 
bootstrap analyses was cited. 

Table 1

Data (n=26) From Stevens (1986, p. 209) Example

Case   Variable 1   Variable 2   Variable 3
  1      5.80000      9.70000      8.90000
  2     10.60000     10.90000     11.00000
  3      8.60000      7.20000      8.70000
  4      4.80000      4.60000      6.20000
  5      8.30000     10.60000      7.80000
  6      4.60000      3.30000      4.70000
  7      4.80000      3.70000      6.40000
  8      6.70000      6.00000      7.20000
  9      7.10000      8.40000      8.40000
 10      6.20000      3.00000      4.30000
 11      4.20000      5.30000      4.20000
 12      6.90000      9.70000      7.20000
 13      5.60000      4.10000      4.30000
 14      4.80000      3.80000      5.30000
 15      2.90000      3.70000      4.20000
 16      6.10000      7.10000      8.10000
 17     12.50000     11.20000      8.90000
 18      5.20000      9.30000      6.20000
 19      5.70000     10.30000      5.50000
 20      6.00000      5.70000      5.40000
 21      5.20000      7.70000      6.90000
 22      7.20000      5.80000      6.70000
 23      8.10000      7.10000      8.10000
 24      3.30000      3.00000      4.90000
 25      7.60000      7.70000      6.20000
 26      7.70000      9.70000      8.90000

Table 2

Descriptive Statistics for the Table 1 Data

Means

       6.40385    6.86923    6.71538

Variance/Covariance Matrix

  1    4.52279    3.98212    2.94114
  2    3.98212    7.41261    3.70049
  3    2.94114    3.70049    3.31015

Inverted Variance/Covariance Matrix

  1    0.56740   -0.12024   -0.36973
  2   -0.12024    0.33075   -0.26292
  3   -0.36973   -0.26292    0.92454

Table 3

Mahalanobis Distances for the 26 Cases

Case      D²        Case      D²
  1     5.40434      14     1.28071
  2     5.89352      15     2.75819
  3     2.67146      16     2.00243
  4     1.30686      17    10.53041
  5     2.38155      18     3.92584
  6     1.79599      19     7.68053
  7     2.75163      20     0.82928
  8     0.69408      21     1.40629
  9     1.19431      22     0.94312
 10     4.90097      23     1.42369
 11     2.41353      24     2.71666
 12     1.77029      25     1.72773
 13     2.80861      26     1.78797

Table 4

Sorted D² and Associated chi-square and p Values
(df=3 and p = (i - .5)/n for the case of rank i)

Rank     D Sq      chi sq       p
  1     0.69408    0.17988   0.01923
  2     0.82928    0.38996   0.05769
  3     0.94312    0.56743   0.09615
  4     1.19431    0.73313   0.13462
  5     1.28071    0.89380   0.17308
  6     1.30686    1.05287   0.21154
  7     1.40629    1.21253   0.25000
  8     1.42369    1.37444   0.28846
  9     1.72773    1.53997   0.32692
 10     1.77029    1.71044   0.36538
 11     1.78797    1.88716   0.40385
 12     1.79599    2.07154   0.44231
 13     2.00243    2.26515   0.48077
 14     2.38155    2.46983   0.51923
 15     2.41353    2.68779   0.55769
 16     2.67146    2.92176   0.59615
 17     2.71666    3.17526   0.63462
 18     2.75163    3.45290   0.67308
 19     2.75819    3.76095   0.71154
 20     2.80861    4.10835   0.75000
 21     3.92584    4.50845   0.78846
 22     4.90097    4.98259   0.82692
 23     5.40434    5.56822   0.86538
 24     5.89352    6.34088   0.90385
 25     7.68053    7.49482   0.94231
 26    10.53041    9.92311   0.98077

Figure 1

Scatterplot of D² and chi-square Statistics